ARIDA: An Arabic Interlanguage Database and Its Applications: A Pilot Study

Authors

  • Anna Feldman
  • Ghazi Abuhakema
  • Eileen Fitzpatrick
Abstract

This paper describes a pilot study in which we collected a small learner corpus of Arabic, developed a tagset for error annotation of Arabic learner data, tagged the data for errors, and performed a simple Computer-aided Error Analysis (CEA).

Language Learner Corpora and Applications

Learner corpora research uses the methods and tools of Second Language Acquisition (SLA) studies and corpus linguistics to gain better insights into authentic learner language at different levels: lexis, grammar, and discourse. One application of learner corpora is Contrastive Interlanguage Analysis (CIA), which involves two types of comparison: (1) native speech (NS) vs. non-native speech (NNS), to highlight the features of nativeness and non-nativeness of learner language; and (2) two or more varieties of NNS, to determine whether non-native features are limited to one group of non-native speakers (in which case they are most probably a transfer-related phenomenon) or shared by several groups of learners with different mother tongue backgrounds (which would point to a developmental issue). Another application is Computer-aided Error Analysis (CEA), which identifies the sources of error (L1 interference, features of novice writing in the new culture, limited vocabulary and language structure, etc.). For this application, error annotation is essential.
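The second CIA comparison described above (one learner group vs. several) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the error tags, the L1 groups, the token counts, and the rate threshold are invented and are not taken from ARIDA or its tagset. The idea is only to show how error-tagged data could be tallied per L1 group and how a feature found in one group might be read differently from a feature shared by several.

```python
from collections import Counter

# Hypothetical toy data: (learner L1, error tag) pairs plus token counts per group.
# Tags and numbers are invented for illustration; they are not ARIDA data or its tagset.
ERRORS = [
    ("English", "agreement"), ("English", "agreement"), ("English", "definiteness"),
    ("French", "agreement"), ("French", "agreement"),
    ("Korean", "agreement"), ("Korean", "word_order"), ("Korean", "word_order"),
]
TOKENS_PER_GROUP = {"English": 1000, "French": 1000, "Korean": 1000}


def rates_per_1000(errors, tokens_per_group):
    """Return {tag: {group: errors per 1000 tokens}} for every tag and group."""
    counts = Counter(errors)
    tags = {tag for _, tag in errors}
    return {
        tag: {
            group: 1000 * counts[(group, tag)] / n_tokens
            for group, n_tokens in tokens_per_group.items()
        }
        for tag in tags
    }


def classify(rates, threshold=1.5):
    """Label each tag by how many L1 groups reach the (assumed) rate threshold."""
    labels = {}
    for tag, by_group in rates.items():
        groups = sorted(g for g, r in by_group.items() if r >= threshold)
        if len(groups) == 1:
            # Seen in one L1 group only: consistent with L1 transfer.
            labels[tag] = ("possibly transfer-related", groups)
        elif len(groups) > 1:
            # Shared across L1 groups: consistent with a developmental issue.
            labels[tag] = ("possibly developmental", groups)
        else:
            labels[tag] = ("too rare to judge", groups)
    return labels


RATES = rates_per_1000(ERRORS, TOKENS_PER_GROUP)
LABELS = classify(RATES)
for tag in sorted(LABELS):
    print(tag, LABELS[tag])
```

A fixed threshold is a deliberately crude stand-in; a real analysis would more likely use a statistical test over much larger samples, and would also need the NS vs. NNS comparison that this sketch leaves out.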


Similar resources

Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Beyond its own merits, text classification (TC) has become a cornerstone in many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...


The production of Yarrowia lipolytica lipase powder by improved spray-drying method

Lipase is used in the production of foods, flavor enhancers, detergents, cosmetics and pharmaceuticals. A common impediment to the production of commercial enzymes is their low stability in aqueous solutions. In this study, the downstream process was investigated to obtain a stable spray-dried lipase powder of Yarrowia lipolytica. The enzyme solution samples were supplemented with different concen...


Error Annotation of the Arabic Learner Corpus - A New Error Tagset

This paper introduces a new two-level error tagset, AALETA (Alfaifi Atwell Leeds Error Tagset for Arabic), to be used for annotating the Arabic Learner Corpora (ALC). The new tagset includes six broad classes, subdivided into 37 more specific error types or subcategories. It is easily understood by Arabic corpus error annotators. AALETA is based on an existing error tagset for Arabic corpora, ...


On the Variability of Interlanguage

The study of second language acquisition involves many aspects, among which interlanguage is an important one. Research on interlanguage and its various characteristics may make a great difference in the study of second language acquisition. This paper sets out to explore one of the characteristics of interlanguage, its variability, so as to study its role in second language acquisition.


Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...




Publication date: 2008